Sound event detection in domestic environments with weakly labeled data and soundscape synthesis

Published in Detection and Classification of Acoustic Scenes and Events 2019, 2019

This paper presents Task 4 of the Detection and Classification of Acoustic Scenes and Events (DCASE) 2019 challenge and provides a first analysis of the challenge results. The task is a follow-up to Task 4 of DCASE 2018 and involves training systems for large-scale detection of sound events using a combination of weakly labeled data, i.e., training labels without time boundaries, and strongly labeled synthesized data. We introduce the Domestic Environment Sound Event Detection (DESED) dataset, which combines part of last year's dataset with an additional synthetic, strongly labeled dataset provided this year, which we describe in more detail. We also report the performance of the submitted systems on the official evaluation (test) and development sets as well as on several additional datasets. The best systems from this year outperform last year's winning system by about 10 percentage points in terms of F-measure.

Citation: Nicolas Turpault, Romain Serizel, Ankit Shah, Justin Salamon, “Sound event detection in domestic environments with weakly labeled data and soundscape synthesis”, Detection and Classification of Acoustic Scenes and Events 2019